Annotating Speech Data for Pronunciation Variation Modelling
Abstract
This paper describes methods for annotating recorded speech with information hypothesised to be important for the pronunciation of words in discourse context. Annotation is structured into six hierarchically ordered tiers, each tier corresponding to a segmentally defined linguistic unit. Automatic methods are used to segment and annotate the respective annotation tiers. Decision tree models trained on annotation from elicited monologue showed a phoneme error rate of 9.91%, corresponding to a 55.25% error reduction compared to using a canonical pronunciation representation from a lexicon for estimating the phonetic realisation.
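The two figures in the abstract can be related by simple arithmetic. The sketch below assumes that "error reduction" means relative reduction, i.e. (baseline - model) / baseline; the abstract does not state the definition, and the function names are illustrative only. Under that assumption, it recovers the phoneme error rate implied for the canonical-lexicon baseline.

```python
def relative_error_reduction(baseline_per: float, model_per: float) -> float:
    """Relative error reduction (as a fraction) of model_per versus baseline_per.

    Assumed definition: (baseline - model) / baseline.
    """
    return (baseline_per - model_per) / baseline_per


def implied_baseline_per(model_per: float, reduction: float) -> float:
    """Invert the assumed definition to recover the baseline error rate."""
    return model_per / (1.0 - reduction)


# Figures quoted in the abstract: 9.91% PER and a 55.25% error reduction.
baseline = implied_baseline_per(9.91, 0.5525)
print(f"Implied canonical-lexicon PER: {baseline:.2f}%")
```

If the assumption holds, the canonical pronunciation representation would have a phoneme error rate of roughly 22.1%. This is an inference from the abstract, not a figure it reports.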
Similar resources
Pronunciation variation modelling using decision tree induction from multiple linguistic parameters
In this paper, resources and methods for annotating speech databases with various types of linguistic information are discussed. The decision tree paradigm is explored for pronunciation variation modelling using multiple linguistic context parameters derived from the annotation. Preliminary results suggest that decision tree induction is a suitable paradigm for the task.
Improving Automatic Phonetic Transcription of Spontaneous Speech Through Variant-Based Pronunciation Variation Modelling
In this paper we present an experiment aimed at improving automatic phonetic transcription of Dutch spontaneous speech through a variant-based method of pronunciation variation modelling. For spontaneous speech, the literature does not always provide enough rules to describe its characteristic phonological processes. Therefore, other methods should be applied to model pronunciation variation fo...
Pronunciation Variation Modelling in a Model of Human Word Recognition
Due to pronunciation variation, many insertions and deletions of phones occur in spontaneous speech. The psycholinguistic model of human speech recognition Shortlist is not well able to deal with phone insertions and deletions and is therefore not well suited for dealing with real-life input. The research presented in this paper explains how Shortlist can benefit from pronunciation variation mo...
Modelling pronunciation variations in spontaneous Mandarin speech
Pronunciation in spontaneous Mandarin speech tends to be much more variable than in read speech. In current recognition systems, pronunciation dictionaries usually only contain one standard pronunciation for each word, so that the amount of variability that can be modelled is very limited. Most recent research work for modelling variations in spontaneous speech focuses on the lexicon level, whi...
Pronunciation variation modelling using accent features
In this paper, we propose a novel method for modelling native accented speech. As an alternative to the notion of dialect, we work with the lower level phonological components of accents, which we term accent features. This provides us with a better understanding of how pronunciation varies and it allows us to give a much more detailed picture of a person’s speech. The accent features are inclu...
Publication date: 2005